15 research outputs found

    Punny Captions: Witty Wordplay in Image Descriptions

    Wit is a form of rich interaction that is often grounded in a specific situation (e.g., a comment in response to an event). In this work, we attempt to build computational models that can produce witty descriptions for a given image. Inspired by a cognitive account of humor appreciation, we employ linguistic wordplay, specifically puns, in image descriptions. We develop two approaches which involve retrieving witty descriptions for a given image from a large corpus of sentences, or generating them via an encoder-decoder neural network architecture. We compare our approach against meaningful baseline approaches via human studies and show substantial improvements. We find that when a human is subject to similar constraints as the model regarding word usage and style, people vote the image descriptions generated by our model to be slightly wittier than human-written witty descriptions. Unsurprisingly, humans are almost always wittier than the model when they are free to choose the vocabulary, style, etc. Comment: NAACL 2018 (11 pages)

    Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings

    Generalizing deep neural networks to new target domains is critical to their real-world utility. In practice, it may be feasible to get some target data labeled, but to be cost-effective it is desirable to select a maximally informative subset via active learning (AL). We study the problem of AL under a domain shift, called Active Domain Adaptation (Active DA). We empirically demonstrate how existing AL approaches based solely on model uncertainty or diversity sampling are suboptimal for Active DA. Our algorithm, Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings (ADA-CLUE), i) identifies target instances for labeling that are both uncertain under the model and diverse in feature space, and ii) leverages the available source and target data for adaptation by optimizing a semi-supervised adversarial entropy loss that is complementary to our active sampling objective. On standard image classification-based domain adaptation benchmarks, ADA-CLUE consistently outperforms competing active adaptation, active learning, and domain adaptation methods across domain shifts of varying severity.

    Evaluating Visual Conversational Agents via Cooperative Human-AI Games

    As AI continues to advance, human-AI teams are inevitable. However, progress in AI is routinely measured in isolation, without a human in the loop. It is crucial to benchmark progress in AI, not just in isolation, but also in terms of how it translates to helping humans perform certain tasks, i.e., the performance of human-AI teams. In this work, we design a cooperative game - GuessWhich - to measure human-AI team performance in the specific context of the AI being a visual conversational agent. GuessWhich involves live interaction between the human and the AI. The AI, which we call ALICE, is provided an image which is unseen by the human. Following a brief description of the image, the human questions ALICE about this secret image to identify it from a fixed pool of images. We measure performance of the human-ALICE team by the number of guesses it takes the human to correctly identify the secret image after a fixed number of dialog rounds with ALICE. We compare performance of the human-ALICE teams for two versions of ALICE. Our human studies suggest a counterintuitive trend - that while AI literature shows that one version outperforms the other when paired with an AI questioner bot, we find that this improvement in AI-AI performance does not translate to improved human-AI performance. This suggests a mismatch between benchmarking of AI in isolation and in the context of human-AI teams. Comment: HCOMP 201

    Prognosis Following Surgery for Recurrent Ovarian Cancer and Diagnostic Criteria Predictive of Cytoreduction Success: A Systematic Review and Meta-Analysis

    For women achieving clinical remission after the completion of initial treatment for epithelial ovarian cancer, 80% with advanced-stage disease will develop recurrence. However, the standard treatment of women with recurrent platinum-sensitive disease remains poorly defined. Secondary (SCS), tertiary (TCS) or quaternary (QCS) cytoreduction surgery for recurrence has been suggested to be associated with increased overall survival (OS). We searched five databases for studies reporting death rate, OS, cytoreduction rates, post-operative morbidity/mortality and diagnostic models predicting complete cytoreduction in a platinum-sensitive disease recurrence setting. Death rates calculated from raw data were pooled based on a random-effects model. Meta-regression/linear regression was performed to explore the role of complete or optimal cytoreduction as a moderator. Pooled death rates were 45%, 51% and 66% for SCS, TCS and QCS, respectively. Median OS for optimal cytoreduction ranged from 16–91, 24–99 and 39–135 months for SCS, TCS and QCS, respectively. Every 10% increase in complete cytoreduction rates at SCS corresponds to a 7% increase in median OS. Complete cytoreduction rates ranged from 9–100%, 35–90% and 33–100% for SCS, TCS and QCS, respectively. Major post-operative thirty-day morbidity was reported to range from 0–47%, 13–33% and 15–29% for SCS, TCS and QCS, respectively. Thirty-day post-operative mortality was 0–6%, 0–3% and 0–2% for SCS, TCS and QCS, respectively. There were two externally validated diagnostic models predicting complete cytoreduction at SCS, but none for TCS and QCS. In conclusion, our data confirm that maximal effort higher order cytoreductive surgery resulting in complete cytoreduction can improve survival.

    Implementation of Multigene Germline and Parallel Somatic Genetic Testing in Epithelial Ovarian Cancer: SIGNPOST Study

    We present findings of a cancer multidisciplinary-team (MDT) coordinated mainstreaming pathway of unselected 5-panel germline BRCA1/BRCA2/RAD51C/RAD51D/BRIP1 and parallel somatic BRCA1/BRCA2 testing in all women with epithelial-OC and highlight the discordance between germline and somatic testing strategies across two cancer centres. Patients were counselled and consented by a cancer MDT member. The uptake of parallel multi-gene germline and somatic testing was 97.7%. Counselling by a clinical-nurse-specialist more frequently needed >1 consultation (53.6% (30/56)) compared to a medical (15.0% (21/137)) or surgical oncologist (15.3% (17/110)) (p 0.001). The median age was 54 (IQR = 51–62) years in germline pathogenic-variant (PV) versus 61 (IQR = 51–71) in BRCA wild-type (p = 0.001). There was no significant difference in distribution of PVs by ethnicity, stage, surgery timing or resection status. A total of 15.5% germline and 7.8% somatic BRCA1/BRCA2 PVs were identified. A total of 2.3% of patients had RAD51C/RAD51D/BRIP1 PVs. A total of 11% of germline PVs were large-genomic-rearrangements and were missed by somatic testing. A total of 20% of germline PVs are missed by a somatic-first BRCA-testing approach and 55.6% of germline PVs are missed by family history ascertainment. The somatic testing failure rate is higher (23%) for patients undergoing diagnostic biopsies. Our findings favour a prospective parallel somatic and germline panel testing approach as a clinically efficient strategy to maximise variant identification. UK Genomics test-directory criteria should be expanded to include a panel of OC genes. Peer reviewed

    Towards natural human-AI interactions in vision and language

    Inter-human interaction is a rich form of communication. Human interactions typically leverage a good theory of mind, involve pragmatics, story-telling, humor, sarcasm, empathy, sympathy, etc. Recently, we have seen a tremendous increase in the frequency and the modalities through which humans interact with AI. Despite this, current human-AI interactions lack many of these features that characterize inter-human interactions. Towards the goal of developing AI that can interact with humans naturally (similar to other humans), I take a two-pronged approach that involves investigating the ways in which both the AI and the human can adapt to each other's characteristics and capabilities. In my research, I study aspects of human interactions, such as humor, story-telling, and the humans' abilities to understand and collaborate with an AI. Specifically, in the vision and language modalities: 1. In an effort to improve the AI's capabilities to adapt its interactions to a human, we build computational models for (i) humor manifested in static images, (ii) contextual, multi-modal humor, and (iii) temporal understanding of the elements of a story. 2. In an effort to improve the capabilities of a collaborative human-AI team, we study (i) a lay person's predictions regarding the behavior of an AI in a situation, and (ii) the extent to which interpretable explanations from an AI can improve performance of a human-AI team. Through this work, I demonstrate that aspects of human interactions (such as certain forms of humor and story-telling) can be modeled with reasonable success using computational models that utilize neural networks. On the other hand, I also show that a lay person can successfully predict the outputs and failures of a deep neural network. Finally, I present evidence that suggests that a lay person who has access to interpretable explanations from the model can collaborate more effectively with a neural network on a goal-driven task. Ph.D.